Notes for MATH 6320: Iterative Methods for Linear Equations - Project #2¶

by Barry Daemi¶

Due date: April 22, 2022, 2:50 P.M.¶

$$\newcommand{\C}{\mathbb{C}} \newcommand{\R}{\mathbb{R}} \newcommand{\Z}{\mathbb{Z}} \newcommand{\Q}{\mathbb{Q}} \newcommand{\F}{\mathbb{F}} \newcommand{\J}{\mathbb{J}}$$

Programming exercises: Recommand using MATLAB

Coding.1 Implement a script that performs the Arnoldi decomposition

$$AV_{k}=V_{k}H_{k}+f_{k}e_{k}^{T}=V_{k+1}\bar{H}_{k}, \quad f_{k}=v_{k+1}\Vert f_{k} \Vert$$

The inputs are $A$, $v_{1}$, $k$, the outputs can be $V_{k}$, $H_{k}$, $f_{k}$, tey can also be $V_{k+1}$, $\bar{H}_{k}$.

     Solution:

     Krylov subspace methods are the most competitive iteraitve method scheme in present-day, in fact, most pertinant problems of present-day of large-scale linear systems found in video game's physic simulation to deep learning problems are solved utilizing preconditioned Krylov subspace solvers; Gutknecht(2006)[1], Heyn et al.(2012)[2]. The appplication of these type of solvers in profound, and there importance in this world of data and computation grow with each day. Nevertheless the objective of this project is to construct these algorithms for the purposes of academic exploration and development on my own part.

     To begin, a Krylov subpace is a space that is generated (spanned) by the images of a vector named $u$ under the subsequent powers of an matrix $A$ in question. This process generates a sequence of vectors,

$$\{u,Au,A^{2},\cdots,A^{m-1}u\}$$

these vectors are the linearly independnent vectors that form the Krylov subspace, which is a m-dimension linear subspace; Wikipedia: Krylov subspace[3]. This m-dimension linear subspace does not necessary equivalent in dimensions as the dimensions of matrix $A$ which formed it. Instead the dimensions of the Krylov subspace is determined by the mathematician and their aims for computation; as they might not want to solve the entire system as it may be computation expensive, or intractable, some mathematician might option to generate a smaller Krylov subspace to represent the larger-scale linear system that is matrix $A$.

     A significant note is that the maximum dimension that a Krylov subspace can take is determined by the number of distinct eigenvalues that are found in matrix $A$, and thus, the Krylov space might not be able to take on the full-dimensions of matrix $A$ as its eigenvalues are not all linearly independent. Though it is not nessary for the Krylov subspace to equil the full-dimensions of matrix $A$ to be able sovle the large-scale linear system that it represents. In fact the notion of preconditioning is to reduce the computational complexity by reducing the large-scale linear system to a more managable small-scale linear system, which we hope possesses a higher concentration of non-degenerate eigenvalues - so that the computational complexity can be reduced significant by this character property.

     As conveyed in Gutknecht(2006, pg. 4)[1], the notion behind Kyrlov subspace methods is that the concept is to generate a sequence, $\{x_{j}\}_{i=0}^{n}$, of approximate solutions that are contained in the Kryov subspace of which solves the large-scale linear system, $Ax=b$. Similar to non-stationary iterative methods (e.g. Projection methods), such as, conjugate residual, GMRES, et cetera, the objective to is to minimize the residual as,

$$\| b-A\tilde{x}\|_{A}=\min_{z\in K} \| x^{*}-z \|_{A}.$$

Since each component of the sequence, $\{x_{j}\}_{i=0}^{n}$, is contained in the Krylov subspace, $x_{n} \in K_{n+1}(A,b)$, so are the corresponding residuals contained in the Krylov subspace, $r_{n} \in K_{n+1}(A,b)$. With each iterative their exists an approximate solution, $x_{i}$, and a corresponding residual $r_{i}$, if all residuals are contained in the sequence of residuals, $\{r_{i}\}_{i=0}^{n}$ are linearly independent, then the iterative scheme possesses an finite termination property, and will thus converge in $n$ iterations to the exact solution of $r_{n}=0$ and $x_{n}=x^{*}$. This property can be attined from Cayley-Hamilton Theorem,

Cayley-Hamilton Theorem (1858, proved by Frobenius in 1878) Let $p(\lambda)=det(A-\lambda I)=\sum_{i=0}^{n} c_{i}\lambda^{i}$ be the charateristic polynomial of $A$. Then $p(A)=0$.

—Yunkai Zhou, Iterative Methods For Linear Equations, slide 111

Enhanced Cayley-Hamilton Theorem Let the minimum polynomial of $A$ be a $m(y)=\sum_{i=0}^{k} c_{i}\lambda^{i}$, where $k$ is the degree of the minimum polynomial. Then $m(A)=0$.

—Yunkai Zhou, Iterative Methods For Linear Equations, slide 112

     Now to construct Krylov space numerically, we need to import numpy as np, matplotlib.pyplot as plt, scipy as sp and pandas as pd; these are the four python packages that need to be imported for the purposes of this project. To implement an algorithm to numerically construct a Krylov subspace, we only need numpy and matplotlib.pyplot.

In [5]:
# Load library package
import numpy as np
import matplotlib.pyplot as plt
from scipy import sparse as sp
from numpy import linalg as LA
import pandas as pd
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import spilu

     It is not necessary but we decided to create a python function to generate symmetric positive-definite matrices for our numerical experiments.

In [6]:
def SPD(n,m):
    A=np.random.normal(size=(n,m));
    A=np.transpose(A)@A;
    
    return A;

     To form a sequence of orthonormal basis for the Krylov space, we can implement an modified Gram-schmidt to orthonormalize the sequence of vectors.

In [7]:
def arnoldi(A,b,x0):
    m=A.shape[0]; n=A.shape[1];
    H=np.zeros((m,n))
    Q=np.zeros((m,n));
    
    ritz_values = []
    
    x0 = x0/np.linalg.norm(x0); Q[:,0] = x0;
    
    for k in range(n):
        u=A@Q[:,k];
        for j in range(k+1):
            qj=Q[:,j];
            H[j,k]=u.T@qj ;
            u=u-H[j,k]*qj;
        
        if k+1 < n:
            H[k+1,k]=np.linalg.norm(u);
            Q[:,k+1]=u/H[k+1,k];
        
        plt.spy(H)
        ritz_values.append(np.linalg.eig(H)[0])
            
    return H,Q;

     To test our Arnoldi decomposition algorithm, we generated a small-scale linear system of 5 by 5.

In [8]:
n=5; m=5;
A=SPD(n,m);
b=np.random.randn(n);
x0=np.random.randn(m);
In [9]:
[H,Q]=arnoldi(A,b,x0)
No description has been provided for this image
In [10]:
df=pd.DataFrame(H)
df
Out[10]:
0 1 2 3 4
0 10.408177 2.099063 -8.215650e-15 2.317591e-14 -2.428596e-12
1 2.099063 6.253088 2.486675e+00 1.598721e-14 -1.609804e-12
2 0.000000 2.486675 4.518148e+00 1.043731e+00 -8.710116e-13
3 0.000000 0.000000 1.043731e+00 8.722138e+00 7.872520e-02
4 0.000000 0.000000 0.000000e+00 7.872520e-02 2.040995e-02
In [8]:
df1=pd.DataFrame(Q)
df1
Out[8]:
0 1 2 3 4
0 -0.049389 0.073260 -0.715274 0.664245 0.198382
1 -0.793488 -0.343895 0.217006 0.079476 0.445764
2 0.555361 -0.130424 0.423781 0.325780 0.623571
3 -0.122738 0.800447 -0.068410 -0.317080 0.488877
4 -0.210826 0.467594 0.506973 0.588039 -0.366192
In [9]:
np.linalg.norm(Q.T @ A @ Q - H)/ np.linalg.norm(A)
Out[9]:
9.687385811771314e-15
In [10]:
np.linalg.norm(Q.T @ Q - np.eye(n))
Out[10]:
2.1695177238518212e-14

     To test our Arnoldi decomposition more vigorously, we generated a small-scale linear system of 500 by 500.

In [11]:
n=500; m=500;
A=SPD(n,m);
b=np.random.randn(n);
x0=np.random.randn(m);
In [12]:
[H,Q]=arnoldi(A,b,x0)
No description has been provided for this image
In [13]:
df2=pd.DataFrame(H)
df2
Out[13]:
0 1 2 3 4 5 6 7 8 9 ... 490 491 492 493 494 495 496 497 498 499
0 531.037382 507.137296 -1.687539e-13 1.403322e-13 2.131628e-14 -7.815970e-14 1.171285e-13 -1.492140e-13 2.273737e-13 -2.309264e-13 ... 2.031640e-08 -2.771402e-08 4.101009e-08 -2.004991e-08 1.587491e-08 -1.878837e-08 9.888035e-09 -2.243364e-08 3.678599e-08 -1.018789e-07
1 507.137296 939.239806 4.731738e+02 -5.968559e-13 8.668621e-13 -1.308287e-12 1.678657e-12 -1.968203e-12 2.415845e-12 -2.950529e-12 ... 4.567488e-08 -6.230293e-08 9.219130e-08 -4.507069e-08 3.567772e-08 -4.222091e-08 2.220163e-08 -5.034966e-08 8.255045e-08 -2.286064e-07
2 0.000000 473.173784 9.785258e+02 5.189420e+02 8.810730e-13 -9.805490e-13 1.065814e-12 -1.244116e-12 1.435962e-12 -1.785239e-12 ... 3.085969e-08 -4.209215e-08 6.228358e-08 -3.044823e-08 2.409719e-08 -2.851345e-08 1.498079e-08 -3.396009e-08 5.567123e-08 -1.541586e-07
3 0.000000 0.000000 5.189420e+02 1.077344e+03 5.458737e+02 3.481659e-13 -4.334311e-13 4.511946e-13 -4.654055e-13 5.151435e-13 ... 3.020580e-09 -4.118224e-09 6.092756e-09 -2.977479e-09 2.351729e-09 -2.780155e-09 1.449913e-09 -3.275076e-09 5.362429e-09 -1.483951e-08
4 0.000000 0.000000 0.000000e+00 5.458737e+02 1.069900e+03 5.165634e+02 3.304024e-13 -3.765876e-13 4.831691e-13 -4.867218e-13 ... -8.073380e-09 1.101306e-08 -1.629649e-08 7.967426e-09 -6.308201e-09 7.465630e-09 -3.928232e-09 8.911335e-09 -1.461233e-08 4.046830e-08
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
495 0.000000 0.000000 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 ... 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 7.745447e+00 7.385889e+00 1.598908e+00 -7.063794e-14 1.616346e-13 -6.147513e-13
496 0.000000 0.000000 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 ... 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 1.598908e+00 9.354532e+00 2.784104e+00 -2.967071e-14 6.891189e-15
497 0.000000 0.000000 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 ... 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 2.784104e+00 4.850156e+00 2.209458e+00 -2.471287e-14
498 0.000000 0.000000 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 ... 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 2.209458e+00 3.758505e+00 8.705842e-01
499 0.000000 0.000000 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 ... 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 8.705842e-01 4.875842e-01

500 rows × 500 columns

In [14]:
df3=pd.DataFrame(Q.T@Q)
df3
Out[14]:
0 1 2 3 4 5 6 7 8 9 ... 490 491 492 493 494 495 496 497 498 499
0 1.000000e+00 4.417317e-16 -9.378429e-16 1.404464e-15 -1.886994e-15 2.376369e-15 -2.946238e-15 3.626565e-15 -4.492459e-15 5.637861e-15 ... -1.518865e-12 2.067147e-12 -3.055939e-12 1.491440e-12 -1.169352e-12 1.377457e-12 -6.983069e-13 1.555071e-12 -2.533678e-12 6.992244e-12
1 4.417317e-16 1.000000e+00 6.035801e-16 -1.242219e-15 1.977972e-15 -2.706646e-15 3.435008e-15 -4.209888e-15 5.095720e-15 -6.354062e-15 ... 4.165139e-11 -5.681255e-11 8.406585e-11 -4.109708e-11 3.252746e-11 -3.849027e-11 2.022900e-11 -4.586418e-11 7.518971e-11 -2.082121e-10
2 -9.378429e-16 6.035801e-16 1.000000e+00 -1.590824e-16 6.466983e-17 2.698522e-17 -1.612203e-16 2.654333e-16 -3.366847e-16 3.584490e-16 ... 1.547963e-11 -2.111405e-11 3.124247e-11 -1.527342e-11 1.208782e-11 -1.430322e-11 7.514921e-12 -1.703560e-11 2.792678e-11 -7.733182e-11
3 1.404464e-15 -1.242219e-15 -1.590824e-16 1.000000e+00 -2.522436e-16 4.756396e-16 -7.926661e-16 9.929708e-16 -1.279208e-15 1.610800e-15 ... -7.699962e-12 1.050355e-11 -1.554258e-11 7.598790e-12 -6.016477e-12 7.120579e-12 -3.747126e-12 8.500845e-12 -1.393911e-11 3.860361e-11
4 -1.886994e-15 1.977972e-15 6.466983e-17 -2.522436e-16 1.000000e+00 -3.640405e-16 9.233583e-16 -1.406792e-15 1.996665e-15 -2.631347e-15 ... 6.014372e-12 -8.201944e-12 1.213547e-11 -5.931665e-12 4.690940e-12 -5.548778e-12 2.907381e-12 -6.582079e-12 1.078504e-11 -2.985700e-11
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
495 1.377457e-12 -3.849027e-11 -1.430322e-11 7.120579e-12 -5.548778e-12 1.842047e-11 -1.332360e-11 1.461547e-11 6.163669e-12 2.414930e-11 ... -2.119832e-15 -1.291849e-14 5.824334e-16 1.917737e-15 -1.951564e-16 1.000000e+00 1.028908e-15 -2.966377e-15 5.259790e-15 -1.520745e-14
496 -6.983069e-13 2.022900e-11 7.514921e-12 -3.747126e-12 2.907381e-12 -9.666555e-12 7.008853e-12 -7.678354e-12 -3.218289e-12 -1.267904e-11 ... 3.567242e-15 2.140161e-14 -9.897140e-16 -3.122014e-15 3.809127e-15 1.028908e-15 1.000000e+00 3.014082e-17 -2.476928e-16 9.715536e-16
497 1.555071e-12 -4.586418e-11 -1.703560e-11 8.500845e-12 -6.582079e-12 2.190075e-11 -1.589786e-11 1.740546e-11 7.273056e-12 2.873226e-11 ... -1.071972e-14 -6.430100e-14 3.018853e-15 9.306791e-15 -1.274631e-14 -2.966377e-15 3.014082e-17 1.000000e+00 -3.827234e-17 1.111090e-15
498 -2.533678e-12 7.518971e-11 2.792678e-11 -1.393911e-11 1.078504e-11 -3.589540e-11 2.606688e-11 -2.853257e-11 -1.191064e-11 -4.709555e-11 ... 1.900422e-14 1.142412e-13 -5.302209e-15 -1.649814e-14 2.312609e-14 5.259790e-15 -2.476928e-16 -3.827234e-17 1.000000e+00 1.541031e-15
499 6.992244e-12 -2.082121e-10 -7.733182e-11 3.860361e-11 -2.985700e-11 9.938682e-11 -7.218897e-11 7.900797e-11 3.296219e-11 1.304024e-10 ... -5.474961e-14 -3.299626e-13 1.517710e-14 4.771791e-14 -6.757789e-14 -1.520745e-14 9.715536e-16 1.111090e-15 1.541031e-15 1.000000e+00

500 rows × 500 columns

In [15]:
np.linalg.norm(Q.T @ A @ Q - H)/ np.linalg.norm(A)
Out[15]:
6.801528514985873e-11
In [16]:
np.linalg.norm(Q.T @ Q - np.eye(n))
Out[16]:
1.5239652052575796e-09

     One can observe the lost of numerical accuracy in the Arnoldi factorization itself, and a significiant lost in numerical accuracy in the orthogonality of the orthonormal basis of the Krylov space. (explain further).

In [ ]:
def arnoldiDGKS(A,b,x0):
    m=A.shape[0]; n=A.shape[1];
    H=np.zeros((m,n))
    Q=np.zeros((m,n));
    
    ritz_values = []
    
    x0 = x0/np.linalg.norm(x0); Q[:,0] = x0;
    
    #for k in range(n):
    #    u=A@Q[:,k];
    #    for j in range(k+1):
    #        qj=Q[:,j];
    #        H[j,k]=u.T@qj ;
    #        u=u-H[j,k]*qj;
    #    
    #    if k+1 < n:
    #        H[k+1,k]=np.linalg.norm(u);
    #        Q[:,k+1]=u/H[k+1,k];
    #    
    #    plt.spy(H)
    #    ritz_values.append(np.linalg.eig(H)[0])
        
    for j in range(m):
        w=A@Q[:,j]; nrmw=np.linalg.norm(w);
        H[0:j,j]=Q[:,0:j].T@w;
        w=w-Q[:,0:j]@H[0:j,j];
        H[j+1,j]=np.linalg.norm(w);
        if H[j+1,j] < 0.717*nrmw:
            Htmp=Q[:,0:j].T*w;
            w=w-Q[:,0:j]*Htmp;
            H[0:j,j]=H[0:j,j]+Htmp; H[j+1,j]=np.linalg.norm(w);
        Q[:,j+1]=w/H[j+1,j]
            
    return H,Q;
In [ ]:
n=5; m=5;
A=SPD(n,m);
b=np.random.randn(n);
x0=np.random.randn(m);
In [ ]:
[H,Q]=arnoldiDGKS(A,b,x0)
In [ ]:
df2=pd.DataFrame(H)
df2
In [ ]:
df3=pd.DataFrame(Q.T@Q)
df3
In [ ]:
np.linalg.norm(Q.T @ A @ Q - H)/ np.linalg.norm(A)
In [ ]:
np.linalg.norm(Q.T @ Q - np.eye(n))

Lastly

Coding.2 Implement the restart FOM and GMRES methods, you can call the Arnoldi decomposition you have implemented.
For both FOM and GMRES, you should not use "\" to solve the projected problem. Instead you should code the Givens rotation and triangular solvers to solve the projected problems for both FOM and GMRES, and use the by-product for measuring convergence without computing the residual norms. (But for debugging/testing of your code, you can use "\" to solve the projected problem, just for comparison with your non-"\" implementations.)
Note that if you set the maximum subspace dimension to tghe matrix size $n$, then the code should perform as if no restarted is used, otheriwse, your code should be able to perform a restart with $m < n$.

     Solution:

     I spend the majority of my time trying to debug this implementation of GMRES, though I was quite unsuccessful. This implementation function as expected in MATLAB, but an equivalent python implementation results in a intractable singular matrix, matrix $x$ solution or out of bounds error. Simply this implementation of GMRES is not suited for Python, and an entirely different approach is needed to implement a functioning GMRES script in Python.

In [17]:
def rotmat(a,b):
    if b == 0.0:
        c=1.0;
        s=0.0;
    elif np.abs(b) > np.abs(a):
        temp=a/b;
        s=1.0/np.sqrt(1.0+temp**2);
        c=temp*s;
    else:
        temp=b/a;
        c=1.0/np.sqrt(1.0+temp**2);
        s=temp*c;
    
    return c,s;
In [ ]:
def gmres(A,x,b,restrt,maxIter,tol):
    iter=0;
    flag=0;
    
    bnrm2=np.linalg.norm(b);
    if bnrm2 == 0.0:
        bnrm2=1;
    r=b-A@x;
    error=np.linalg.norm(r)/bnrm2;
    if error < tol:
        return x,error, iter, flag;
    m=A.shape[0]; n=A.shape[0];
    V=np.zeros((n,m+1));
    H=np.zeros((m+1,m));
    cs=np.zeros((m,1));
    sn=np.zeros((m,1));
    e1=np.zeros((n,1));
    e1[0]=1.0;
    
    for iter in range(maxIter):
        r=b-A@x;
        V[:,0]=r/np.linalg.norm(r);
        s=np.linalg.norm(r)*e1;
        for i in range(m-1):
            w=A@V[:,i];
            for k in range(i):
                H[k,i]=w.T@V[:,k];
                w=w-H[k,i]*V[:,k];
            if i+1 < n:
                H[i+1,i]=np.linalg.norm(w);
                V[:,i+1]=w/H[i+1,i];
            for k in range(i):
                temp=cs[k]*H[k,i]+sn[k]*H[k+1,i];
                H[k+1,i]=-sn[k]*H[k,i]+cs[k]*H[k+1,i];
                H[k,i]=temp;
            [cs[i],sn[i]]=rotmat(H[i,i],H[i+1,i]);
            temp=cs[i]*s[i];
            s[i+1]=-sn[i]*s[i];
            s[i]=temp;
            H[i,i]=cs[i]*H[i,i]+sn[i]*H[i+1,i];
            H[i+1,i]=0.0;
            error=np.abs(s[i+1])/bnrm2;
            if error <= tol:
                y=np.linalg.solve(H,s);
                x=x+V@y;
                break;
                
        if error <= tol:
            break;    
        y=np.linalg.solve(H,s);
        x=x+V@y;
        r=b-A@x;
        s[i+1]=np.linalg.norm(r);
        error=s[i+1]/bnrm2;
        if error <= tol:
            break;
    if error < tol:
        flag=1
        
    return x,error,iter,flag;  
In [ ]:
n=100;m=100;
A=SPD(n,m);
b=np.random.randn(n);
x0=np.random.randn(m);
restrt=900;
maxIter=5000
tol=1e-10;
In [ ]:
[x,error,iter,flag]=gmres(A,x0,b,restrt,maxIter,tol)
In [ ]:
error
In [ ]:
np.linalg.norm(b-A@x)

     After a significant time investment in the last version of implementation, I decide to derive GMRES by hand, and then implement it in Python code; the results of my toils is as follows.

In [11]:
def givensrotation(a,b):
    if b==0:
        c=1;
        s=0;
    else:
        if np.abs(b) > abs(a):
            r=a/b;
            s=1/np.sqrt(1+r**2);
            c=s*r;
        else:
            r=b/a;
            c=1/np.sqrt(1+r**2);
            s=c*r;
            
    return c,s;
In [12]:
def gmres(A,x0,tol):
    iter=0; x=x0;
    bnrm2=np.linalg.norm(b);
    if bnrm2==0.0:
        bnrm2=1;
    r=b-A@x0;
    error=np.linalg.norm(r)/bnrm2;
    
    if error < tol:
        iter=1+iter;
        return x,error,iter;
    
    m=A.shape[0]; n=A.shape[1];
    H=np.zeros((m,n))
    Q=np.zeros((m,n));
    e1=np.zeros((n,1));
    e1[m-1]=1.0;
    S=np.zeros((m,1))
    S=np.linalg.norm(r)*e1;
    
    ritz_values = [];
    
    x0 = x0/np.linalg.norm(x0); Q[:,0] = x0;
    
    
    for k in range(n):
        u=A@Q[:,k];
        for j in range(k+1):
            qj=Q[:,j];
            H[j,k]=qj @ u;
            u=u-H[j,k]*qj;
        
        if k+1 < n:
            H[k+1,k]=np.linalg.norm(u);
            Q[:,k+1]=u/H[k+1,k];
        
        plt.spy(H);
        ritz_values.append(np.linalg.eig(H)[0]);
        
    df=pd.DataFrame(H)
    print(df)    
    
    error1=np.zeros((2,1));
    I=np.eye(m);
    error1[0,0]=np.linalg.norm(Q.T@A@Q-H)/ np.linalg.norm(A);
    error1[1,0]=np.linalg.norm(Q.T@Q-np.eye(n));
    print("Factorization error = " + str(error1[0,0]) + " " + "Error of Q = " + str(error1[1,0]))
    
    V=np.eye(m);
    
    R=H;
    
    for j in range(n-1):
        for i in range(j+1,j,-1):
            G=np.eye(m);
            [c,s]=givensrotation(H[i-1,j],H[i,j]);
            G[i-1,i-1]=c; G[i-1,i]=-s; 
            G[i,i-1]=s; G[i,i]=c;
            df1=pd.DataFrame(G)
            df1
            H=G.T@H;
            V=V@G

    return H,V;
In [17]:
n=5;m=5;
A=SPD(n,m);
b=np.random.randn(n);
x0=np.random.randn(m);
restrt=900;
maxIter=5000
tol=1e-10;
In [18]:
[H,V]=gmres(A,x0,tol)
          0         1             2             3             4
0  3.014705  1.303124  6.661338e-16  2.220446e-16 -9.228729e-16
1  1.303124  4.764382  4.602523e+00  4.440892e-16 -2.595146e-15
2  0.000000  4.602523  8.681672e+00  3.860315e+00  5.440093e-15
3  0.000000  0.000000  3.860315e+00  5.178227e+00  9.347696e-01
4  0.000000  0.000000  0.000000e+00  9.347696e-01  9.009698e-01
Factorization error = 7.956301465264818e-16 Error of Q = 3.215827389821516e-15
No description has been provided for this image
In [19]:
df1=pd.DataFrame(H)
df1
Out[19]:
0 1 2 3 4
0 3.284292e+00 3.086544e+00 1.826164e+00 3.800216e-16 -1.876808e-15
1 -1.065805e-16 6.004492e+00 9.367859e+00 2.958982e+00 2.875200e-15
2 6.588413e-17 9.892409e-17 4.512763e+00 5.713627e+00 7.996220e-01
3 -5.601009e-17 -8.409835e-17 -1.244649e-16 1.090297e+00 1.021654e+00
4 9.329269e-17 1.400777e-16 2.073138e-16 8.159767e-17 4.866962e-02

     I was able to attain the hessenberg matrix, and was able to transform away the inferior diagonal of the hessenberg matrix to form an upper triangular matrix. The last major challenge in which I was not able to resolve due to running out of time was properly constructing the beta vector use to solve for the y solution of the upper triangular matrix. If I had a little more time, I would have been able to resolve this issue.

     

In [ ]:
            temp=c*S[i];
            S[i-1]=-s*S[i];
            S[i]=temp;
            error=np.abs(S[i])/bnrm2;
            if error < tol:
                y=np.linalg.solve(H,S);
                x=x+Q@y;
                break;
            
            
            
    df=pd.DataFrame(H)
    print(df)         
    
    error2=np.zeros((2,1));
    error2[0,0]=np.linalg.norm(R-(V@H));
    error2[1,0]=np.linalg.norm(I-(V.T@V));
    print("Factorization error = " + str(error2[0,0]) + " " + "Error of Orthogonal matrix = " + str(error2[1,0])); 
          
    return x,error;

     Unfortuntely I ran out of time, and was unable to write this last portion of the script to properly construct the beta vector, so to attain the approximate solution $x$.

     Due to running out of time, I was unable to implement FOM. Though Dr. Zhou did provided an efficient MATLAB implementation of GMRES.

Your code title
    
      function [x, residual, iter, mvcnt] = GMRes(A, b, m, opts);
%
% non-preconditioned GMRes linear equation solver to solve A*x = b for x
%
% Input: A -- the matrix
%        b -- the right hand side
%        m -- the allowed max dimension with restart (maybe automatically adapted)
%     opts -- options for the solver, including
%               tol -- stopping error tolerance
%             ITmax -- maximum iteration number 
%                x0 -- initial solution (if chosen)
%             x_ext -- exact solution (if available)
%
% Output: 
%        x -- found solution within ITmax iterations
% residual -- vector of residual norms for the history of iteration
%     iter -- total number of iterations used 
%    mvcnt -- number of matrix-vector products
%

%
% (-ykz, for spring semester 2022) %

DEBUG = 0;  
%DEBUG = 1;   %uncomment this line to see debugging printouts

n = size(A,1);  mvcnt = 0;
if (nargin >= 4),
    if isfield(opts,'x0'),           x0=opts.x0;        end
    if isfield(opts,'ITmax'),        ITmax=opts.ITmax;  end
    if isfield(opts,'tol'),          tol=opts.tol;      end
    if isfield(opts,'x_ext'),        x_ext=opts.x_ext;  end
end

% in case opts does not contain necessary fields, set them explicitly
if ~exist('x0'),    
    x0 = zeros(n,1);    r0 = b;
else
    r0 = b - A*x0;      mvcnt = mvcnt +1;
end
if ~exist('ITmax'), ITmax=max(1000, 2*n);       end
if ~exist('tol'),   tol = 1e-10;                end
if ~exist('print_step'), print_step=2;          end    
if exist('x_ext'),  %double check if the input x_ext is accurate or not
    err_ext = norm(A*x_ext -b);
    if err_ext > 1e-11*norm(b),
        fprintf('input exact solution not exact, err=%e,  not used for verification\n', err_ext)
        EXACT_sol = 0;
    else
        EXACT_sol = 1;
    end
end

residual = [];
normb = norm(b);
beta = norm(r0);     


for iter = 1: ITmax,

% call arnoldi to generate AV = VH + fe_m', or equivalently 
%    A * V(:,1:m) = V(:,1:m+1) * H(1:m+1,1:m)  
%
[H, V] = arnoldi(A, m, r0); 
mvcnt = mvcnt + m;

if DEBUG==1,  %when DEBUG is 1, the following LS solve takes extra time, should avoid for real computation
    ymd = beta* (H(1:m+1,1:m) \ eye(m+1,1));   % (use direct method only for testing and comparison)
    xmd = x0 + V(:,1:m)*ymd; 
end

%
% apply Givens:  to save computation for speedup, it is important NOT to form or compute a full
% size-(m+1) unitary matrix Q.  
% the multiplication of the Givens rotation should be applied only to the two rows/columns 
% that need to be updated (the rest should not need any update. this avoids a huge amount of 
% computations involving multiplication by 0's)
%
gv = eye(m+1,1);   %gv is the vector to store Q*e_1  (where Q is the product of all Givens rotators, 
                   %note: for faster computation, the Q matrix should never be formed explicitly)

% Vtmp = V;        %using another V matrix should be strictly avoided in real computation

vcol = V(:,1);     %in the loop, vcol will be repeatedlt updated to store the last column of V*Q' 
for i = 1 : m,
    tmp = sqrt(H(i,i)^2 + H(i+1,i)^2);
    c = H(i,i) / tmp;
    s = H(i+1,i) / tmp;
    htmp = H(i, i:m);
    H(i, i:m) = c*htmp + s*H(i+1, i:m);
    H(i+1, i:m) = -s'*htmp + c'*H(i+1, i:m);  %the complex-conjugate is necessary if A is complex
    gtmp = gv(i);
    gv(i) = c*gtmp + s*gv(i+1);
    gv(i+1) = -s'*gtmp + c'*gv(i+1);

    %%need to update V(:,1:m+1) to get the last column for the residual vector,
    %%but V(:,1:m) needs to be unchanged for constructing x; so a clumsy way is to use Vtmp=V 
    %%before this loop is entered, then update as the following to apply G' to Vtemp from the right:
    % vtmp = Vtmp(:,i);  
    % Vtmp(:, i) = c'*vtmp + s'*Vtmp(:, i+1);
    % Vtmp(:, i+1) = -s*vtmp + c*Vtmp(:, i+1);

    %
    %a better and much more efficient way is to use a temp vector (vcol) to store the most recent last column
    %in the matrix produce V*Q' (since computing the residual vector only needs the last column in V*Q'. and
    %the last column of V*Q' is the iterated update of the 'last' updated column of V*G' inside this loop.
    %there is no need to compute V*Q' at all, not even the need to form Q)
    %
    vcol = -s*vcol + c*V(:, i+1);  %only keeps the 'last' updated column of V*G' for each Givens rotation
end

%should use backward-solve for the upper-triangular system
ym = beta * triu_solve(triu(H(1:m,1:m)), gv(1:m));   

x = x0 + V(:,1:m)*ym;

resid = beta * abs(gv(m+1));   %resid = the true residual ||b - A*x||_2, the latter need not be computed
residual = [residual; resid];

if DEBUG==1,  
    ydiff = norm(ymd - ym);  
    true_resid = norm(A*x - b); 
    fprintf('it=%i, ydiff=%e,  resid=%e, true_resid=%e\n', iter,  ydiff, resid, true_resid)
end

if resid < tol*normb, 
    break;
else %restart 
    x0 = x;    %set the most recent solution as the next initial

    %r0 = beta*gv(m+1)*Vtmp(:,m+1); %note: r0 should not come from forming the matrix Vtmp=V*Q', 
    r0 = beta*gv(m+1)*vcol;         %r0 can be obtained from the last column of V*Q', already stored in vcol 

    betaold = beta;
    beta = resid; %update beta to be the norm of the current r0:  beta = ||r0|| = ||b-Ax||

    if DEBUG==1,  %debugging tests
        fprintf('compare r0 with true residual: || r0 - (b-Ax) || = %e\n', norm(b- A*x - r0))
        fprintf('compare beta with true |r0|:  |r0| = %e,  beta=%e\n', norm(r0), beta)
    end

    %the following is not in the standard restarted GMres, here we add this to introduce some adpativeness
    %if the residual stagnates, then automatically increase m by 10 (can skip this step by commenting it out,
    %in that case u may not to manually increase m by a re-run if the input m was too small)
    if abs(betaold - beta) < normb*tol,
        m = min(m+10, n);
        if DEBUG==1, fprintf(' --> Restart seems to stagnate, increase m to %i\n',m); end 
    end
end

end

%========================================================= function [x, err] = triu_solve(R, b) % % solve R x = b for R upper triangular %
[m, n]=size(R); x = zeros(n,1);

x(n) = b(n)/R(n,n); for i = n-1 : -1 : 1 x(i) = ( b(i) - R(i, i+1:n)x(i+1:n) )/R(i,i); end if (nargout > 1), err = norm(Rx - b); end

%========================================================= function [H, V, Error] = arnoldi(A, k, v0) % % Usage: [H, V] = arnoldi(A, k, v0) % % Input: A -- a square n by n matrix
% k -- an integer specifying how many Arnoldi steps to take, % (if k=n, a full Hessenberg decomposition is % constructed; in case arnoldi breaks down, random % vectors are used to continue the iteration.) % v0 -- (optional) initial vector % % Output: H -- if k<n, the k+1 by k Hessenberg matrix
% else, the n by n Hessenberg matrix % V -- if k<n, the n by k+1 orthogonal matrix, % else, the n by n orthogonal matrix %
% % This script constructs a k-step Arnoldi decomposition % % A * V(:,1:k) = V(:,1:k+1) * H(1:k+1,1:k)
% % using CGS with DGKS reorthogonalization. % n = size(A, 1); k = min(k, n); V = zeros(n, k+1); H = zeros(k+1, k);

if (nargin < 3), v0 = rand(n,1); end

nrmv= norm(v0); if nrmv > 0;
V(:,1) = v0/nrmv; else v0 = rand(n,1); V(:,1) = v0/norm(v0);
end

irand = 0; % count # of break downs (# of random vectors used)

for j = 1 : k

  f = A * V(:,j);
  h = V(:,1:j)'*f;
  f = f - V(:,1:j)*h;

  % DGKS reorthogonalization 
    c =  V(:,1:j)'*f;
    f = f - V(:,1:j)*c;
    H(1:j,j) = h + c;

  H(j+1,j) = norm(f);
  if  abs(H(j+1,j)) > 5e-16
      V(:,j+1) = f / H(j+1,j);  
  else              % an invariant subspace found
      H(j+1,j) = 0;
      if k < n,
          break,    % exit (even if j < k)
      else
          % continue with a new random vector as V(:,j+1),
          % always do a reorthogonalization in this case
          irand = irand + 1;
          f = rand(n,1);
          h = V(:,1:j)'*f;
          f = f - V(:,1:j)*h;
          h = V(:,1:j)'*f;
          f = f - V(:,1:j)*h;
          V(:,j+1) = f /norm(f);
      end
  end

end

if (nargout > 2), Error = norm(AV(:,1:j) - V(:,1:j+1)H) %normf = norm(f) fprintf('Inside Arnoldi, number of random vectors used = %d \n', irand) end

</code>

Coding.3 Implement the restart FOM and GMRES methods, both using preconditioner provided by the Matlab's $\text{ilu}$ command (check the $\text{ilu}$ documentation for usage, you can use the most common zero-fill $\text{ilu}$).

     Solution:

     Since I was not able to properly implement GMRES or FOM; I was not able to derive and implement the preconditioned version of these iterative schemes.

In [ ]:
# empy code block - unfortunately was not able to code the preconditioned version of GMRES and FOM
# Though in a later date, I will update this Jupyter notebook to include them. (\(0w0)/)

Coding.4 Impement the CG (e.g. conjugate gradient method) and CR (e.g. conjugate residual method). For the preconditioned versions, use right preconditioning with M-inner product, and split preconditioning. You may decide (on your own) which $M$ matrices to use, for example, you may use imcomplete Cholesky decomposition (check the $\text{ichol}$ documentation for usage).

     Solution:

     Here is our implementation of conjugate gradient method.

In [20]:
def conjgrad(A,b,x,tol):
    r=b-A@x;
    p=r; i=0;
    
    while np.linalg.norm(r) >= tol*np.linalg.norm(b):
        z=A@p; 
        gamma=r.T@r; 
        alpha=gamma/(z.T@p);
        x=x+alpha*p;
        r=r-alpha*z;
        beta=(r.T@r)/gamma;
        p=r+beta*p;
        i=i+1;
        if (i % 5) ==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i) + ", " + "Error = " + str(error));
        
    return x
In [21]:
n=300; m=300;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
In [22]:
x=conjgrad(A,b,x,tol)
Iteration = 5, Error = 329.08661196858486
Iteration = 10, Error = 103.35205011499148
Iteration = 15, Error = 60.71132300184745
Iteration = 20, Error = 33.87950206609813
Iteration = 25, Error = 21.113747370360837
Iteration = 30, Error = 21.339664850656508
Iteration = 35, Error = 17.890721450247643
Iteration = 40, Error = 12.617338938494747
Iteration = 45, Error = 11.96847041346811
Iteration = 50, Error = 10.815918772882162
Iteration = 55, Error = 10.58591252138031
Iteration = 60, Error = 10.001922857106592
Iteration = 65, Error = 8.89476530259825
Iteration = 70, Error = 8.322485601554021
Iteration = 75, Error = 9.208512681072486
Iteration = 80, Error = 11.280959870159133
Iteration = 85, Error = 10.35141488404301
Iteration = 90, Error = 9.878538754965298
Iteration = 95, Error = 12.591077477098766
Iteration = 100, Error = 11.252416429304523
Iteration = 105, Error = 10.707231886365278
Iteration = 110, Error = 8.684766498542459
Iteration = 115, Error = 10.698214828427902
Iteration = 120, Error = 9.913376003438428
Iteration = 125, Error = 10.507085545637388
Iteration = 130, Error = 15.354914419840851
Iteration = 135, Error = 15.772680131651379
Iteration = 140, Error = 16.74230924418639
Iteration = 145, Error = 22.13042544309019
Iteration = 150, Error = 19.656077727479623
Iteration = 155, Error = 21.05627772793654
Iteration = 160, Error = 15.437889690921999
Iteration = 165, Error = 12.151484773237511
Iteration = 170, Error = 14.560162686845661
Iteration = 175, Error = 12.088535189876435
Iteration = 180, Error = 9.35166189682628
Iteration = 185, Error = 8.578890289479563
Iteration = 190, Error = 7.500219622687958
Iteration = 195, Error = 7.997548594797461
Iteration = 200, Error = 8.013255218322083
Iteration = 205, Error = 5.678538300779145
Iteration = 210, Error = 5.546014920178971
Iteration = 215, Error = 5.842672436820272
Iteration = 220, Error = 6.965965769121123
Iteration = 225, Error = 7.163732287880487
Iteration = 230, Error = 6.196180189601845
Iteration = 235, Error = 4.792551385818861
Iteration = 240, Error = 7.1262484000568715
Iteration = 245, Error = 5.832209981963056
Iteration = 250, Error = 4.519430172498435
Iteration = 255, Error = 5.935717011987168
Iteration = 260, Error = 5.46999567567788
Iteration = 265, Error = 4.322404406344624
Iteration = 270, Error = 4.277957078103739
Iteration = 275, Error = 3.739801461002808
Iteration = 280, Error = 3.193081322605647
Iteration = 285, Error = 5.73971342898265
Iteration = 290, Error = 6.758945781531908
Iteration = 295, Error = 7.330807650238632
Iteration = 300, Error = 6.939740790211235
Iteration = 305, Error = 6.194992613209219
Iteration = 310, Error = 7.158406348034193
Iteration = 315, Error = 7.562202756062955
Iteration = 320, Error = 5.40592363434395
Iteration = 325, Error = 6.542282717935332
Iteration = 330, Error = 7.169677950464694
Iteration = 335, Error = 3.962111020949104
Iteration = 340, Error = 5.2348374549368115
Iteration = 345, Error = 8.073775565580316
Iteration = 350, Error = 4.759500171186901
Iteration = 355, Error = 6.729676903625017
Iteration = 360, Error = 6.482642239036664
Iteration = 365, Error = 3.7751934345963893
Iteration = 370, Error = 3.8433506260105004
Iteration = 375, Error = 7.461156204238502
Iteration = 380, Error = 7.300171532819365
Iteration = 385, Error = 6.201040630989414
Iteration = 390, Error = 10.180529421067957
Iteration = 395, Error = 8.43032781544206
Iteration = 400, Error = 6.224706430576621
Iteration = 405, Error = 3.5230343877951045
Iteration = 410, Error = 2.295941896642838
Iteration = 415, Error = 3.0109919050352785
Iteration = 420, Error = 3.6162854127313655
Iteration = 425, Error = 2.54948640247526
Iteration = 430, Error = 2.2566040888374586
Iteration = 435, Error = 1.8333958182730972
Iteration = 440, Error = 1.3974518135412186
Iteration = 445, Error = 2.027290118565887
Iteration = 450, Error = 1.0191795188613995
Iteration = 455, Error = 2.148497094368834
Iteration = 460, Error = 2.2102028376793936
Iteration = 465, Error = 1.4820361590407793
Iteration = 470, Error = 1.2779623415848673
Iteration = 475, Error = 3.119464458166037
Iteration = 480, Error = 3.2182022687576053
Iteration = 485, Error = 1.9868234627066905
Iteration = 490, Error = 1.5839441385597568
Iteration = 495, Error = 0.23998738373668946
Iteration = 500, Error = 0.7322967491678848
Iteration = 505, Error = 0.0577998929992923
Iteration = 510, Error = 0.08312568126423851
Iteration = 515, Error = 0.00315626392139457
Iteration = 520, Error = 6.617439722312629e-05
Iteration = 525, Error = 3.7197703559264957e-06
Iteration = 530, Error = 2.590056173379263e-07
Iteration = 535, Error = 7.82623121881804e-09

     Here is our implementation of conjugate residual method.

In [23]:
def conjres(A,b,x,tol):
    r=b-A@x; p=r; Ar=A@r; 
    Ap=Ar; gamma=r.T@Ar; 
    i=0;
    
    while np.sqrt(np.abs(gamma)) >= tol*np.linalg.norm(b):
        alpha=gamma/(Ap.T@Ap);
        x=x+alpha*p;
        r=r-alpha*Ap;
        Ar=A@r;
        gamma0=gamma; gamma=r.T@Ar;
        beta=gamma/gamma0;
        p=r+beta*p;
        Ap=Ar+beta*Ap;
        i=i+1;
        if (i % 5) ==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i) + ", " + "Error = " + str(error));
        
        
    return x;
In [26]:
n=300; m=300;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
In [27]:
x=conjres(A,b,x,tol)
Iteration = 5, Error = 246.61816163592079
Iteration = 10, Error = 69.25904848516981
Iteration = 15, Error = 28.294191011491186
Iteration = 20, Error = 14.966816512453699
Iteration = 25, Error = 9.50909126174186
Iteration = 30, Error = 6.105705090670038
Iteration = 35, Error = 4.510792116031515
Iteration = 40, Error = 3.8523015004611563
Iteration = 45, Error = 3.4557925005346464
Iteration = 50, Error = 3.0462070120665543
Iteration = 55, Error = 2.7554112642166446
Iteration = 60, Error = 2.523984773676935
Iteration = 65, Error = 2.3280081862661137
Iteration = 70, Error = 2.1866180967130355
Iteration = 75, Error = 2.092913163419333
Iteration = 80, Error = 1.9813775932941204
Iteration = 85, Error = 1.840988670957213
Iteration = 90, Error = 1.7104039370471609
Iteration = 95, Error = 1.6052064113836424
Iteration = 100, Error = 1.5322679538657473
Iteration = 105, Error = 1.463258878501897
Iteration = 110, Error = 1.4007655879022005
Iteration = 115, Error = 1.3375220281489015
Iteration = 120, Error = 1.261870617846448
Iteration = 125, Error = 1.207580065896284
Iteration = 130, Error = 1.17950423360681
Iteration = 135, Error = 1.1569277656445023
Iteration = 140, Error = 1.127191094360732
Iteration = 145, Error = 1.1002366826217296
Iteration = 150, Error = 1.0740930915037834
Iteration = 155, Error = 1.0547805582139962
Iteration = 160, Error = 1.030692033803644
Iteration = 165, Error = 1.0059122937117215
Iteration = 170, Error = 0.9833778156919724
Iteration = 175, Error = 0.9635532369567041
Iteration = 180, Error = 0.9401464529478809
Iteration = 185, Error = 0.9189605124494051
Iteration = 190, Error = 0.8917645546088003
Iteration = 195, Error = 0.8646936012737189
Iteration = 200, Error = 0.8330431205056137
Iteration = 205, Error = 0.8007438378548217
Iteration = 210, Error = 0.7771421427733263
Iteration = 215, Error = 0.7566808470453751
Iteration = 220, Error = 0.7382243995395192
Iteration = 225, Error = 0.7287223294052768
Iteration = 230, Error = 0.7212925178908469
Iteration = 235, Error = 0.7140557425472707
Iteration = 240, Error = 0.7069612086673008
Iteration = 245, Error = 0.700586085204167
Iteration = 250, Error = 0.6960355341461008
Iteration = 255, Error = 0.6909760465912177
Iteration = 260, Error = 0.6823923079620985
Iteration = 265, Error = 0.674698746214856
Iteration = 270, Error = 0.6673960810840052
Iteration = 275, Error = 0.6539559051769905
Iteration = 280, Error = 0.6440604691143629
Iteration = 285, Error = 0.6321487501601334
Iteration = 290, Error = 0.6221218667073166
Iteration = 295, Error = 0.6151529054048974
Iteration = 300, Error = 0.6076927808633997
Iteration = 305, Error = 0.6003258731362234
Iteration = 310, Error = 0.5929105414011019
Iteration = 315, Error = 0.5877887002284325
Iteration = 320, Error = 0.5797824158471403
Iteration = 325, Error = 0.5705507107254337
Iteration = 330, Error = 0.5649441248933335
Iteration = 335, Error = 0.5545108519946231
Iteration = 340, Error = 0.5462066558909578
Iteration = 345, Error = 0.5404048504433753
Iteration = 350, Error = 0.5352411944060003
Iteration = 355, Error = 0.5254924585950316
Iteration = 360, Error = 0.5072963155919521
Iteration = 365, Error = 0.4953793994295423
Iteration = 370, Error = 0.48917093032957737
Iteration = 375, Error = 0.4790343881146902
Iteration = 380, Error = 0.47214937886607883
Iteration = 385, Error = 0.46626669559204587
Iteration = 390, Error = 0.46012104915556273
Iteration = 395, Error = 0.4569308939933477
Iteration = 400, Error = 0.45484149831709897
Iteration = 405, Error = 0.4520701898567687
Iteration = 410, Error = 0.4497179331815779
Iteration = 415, Error = 0.4468212698313619
Iteration = 420, Error = 0.44294343893283855
Iteration = 425, Error = 0.4406609204317226
Iteration = 430, Error = 0.4393012703617329
Iteration = 435, Error = 0.4374341237973073
Iteration = 440, Error = 0.43488504771081227
Iteration = 445, Error = 0.43056551116973
Iteration = 450, Error = 0.4267116965381107
Iteration = 455, Error = 0.4085429160264582
Iteration = 460, Error = 0.3421769840869736
Iteration = 465, Error = 0.289891096894338
Iteration = 470, Error = 0.24187399564087206
Iteration = 475, Error = 0.18587717375169202
Iteration = 480, Error = 0.1044808650272657
Iteration = 485, Error = 0.07154392502450029
Iteration = 490, Error = 0.06518698297353913
Iteration = 495, Error = 0.062018480132628194
Iteration = 500, Error = 0.05769907333507009
Iteration = 505, Error = 0.04644110453476088
Iteration = 510, Error = 0.04390129124558783
Iteration = 515, Error = 0.0437685638464659
Iteration = 520, Error = 0.00834794207329096
Iteration = 525, Error = 0.0003124260834526587
Iteration = 530, Error = 7.115549881818357e-06
Iteration = 535, Error = 4.6721950949289566e-07
Iteration = 540, Error = 6.98415344662874e-08
Iteration = 545, Error = 2.094777066994407e-08
Iteration = 550, Error = 1.3097056056604988e-08
Iteration = 555, Error = 6.450841780346827e-09
Iteration = 560, Error = 3.6221201043820335e-09
Iteration = 565, Error = 2.098173989325643e-09
Iteration = 570, Error = 1.1831270140132863e-09
Iteration = 575, Error = 6.076832936005906e-10
Iteration = 580, Error = 3.673515500073521e-10

     The python package name Scipy is the predominant sparse linear algebra package available to Python users. The major limitation with this python package is it useage of proprietary data types and structure types; it is a serious handicap that prevent widespread use of it. In this case specifically, utilization of the incomplete cholesky decomposition available in Scipy resulted in a matrix with a non-convertable proprietary data type. Because the resulting precondition matrix possessed an non-convertable proprietary data type, it was unable to be used in further calculations, as numpy did not contain the support for it.

     As a consequence I had to derive and wrote my own incomplete cholesky decomposition to produce a precondition matrix for the precondition version of CG and CR methods.

In [29]:
def ichol(M):
    n=M.shape[0];
    
    for k in range(n):
        M[k,k]=np.sqrt(np.abs(M[k,k]));
        for i in range((k+1),n):
            if M[i,k]!=0:
                M[i,k]=M[i,k]/M[k,k];
        for j in range((k+1),n):
            for i in range(j,n):
                if M[i,k]!=0:
                    M[i,j]=M[i,j]-M[i,k]*M[j,k];
                    
        for i in range(n):
            for j in range((i+1),n):
                M[i,j]=0;
                
    return M;

     Generated a small 10 by 10 matrix to test incomplete cholesky decomposition.

In [30]:
n=10; m=10;
A=SPD(n,m);
df3=pd.DataFrame(A)
df3
Out[30]:
0 1 2 3 4 5 6 7 8 9
0 20.785799 -1.778591 -2.070010 6.872917 2.918517 3.767847 4.300331 -4.037706 -7.738089 7.451273
1 -1.778591 3.749765 2.301858 -2.268078 2.983920 -0.590407 1.283308 -0.143865 -2.339834 -0.705777
2 -2.070010 2.301858 6.663856 1.771233 -0.232366 1.065979 1.017149 1.072569 2.890032 -3.907837
3 6.872917 -2.268078 1.771233 10.255555 4.725299 4.074437 2.309214 0.666974 6.183271 0.379173
4 2.918517 2.983920 -0.232366 4.725299 17.507754 3.616163 3.724373 2.237717 -1.621177 -1.463696
5 3.767847 -0.590407 1.065979 4.074437 3.616163 5.140977 0.784688 -0.912350 0.079583 -3.333271
6 4.300331 1.283308 1.017149 2.309214 3.724373 0.784688 2.802046 -0.621201 -1.266005 0.337485
7 -4.037706 -0.143865 1.072569 0.666974 2.237717 -0.912350 -0.621201 3.619154 2.706780 -2.103359
8 -7.738089 -2.339834 2.890032 6.183271 -1.621177 0.079583 -1.266005 2.706780 16.016800 -2.358960
9 7.451273 -0.705777 -3.907837 0.379173 -1.463696 -3.333271 0.337485 -2.103359 -2.358960 14.606818
In [31]:
M=ichol(A)
In [32]:
df4=pd.DataFrame(M)
df4
Out[32]:
0 1 2 3 4 5 6 7 8 9
0 4.559145 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000
1 -0.390115 1.896728 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000
2 -0.454035 1.120210 2.280973 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000
3 1.507501 -0.885725 1.511587 2.216662 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000
4 0.640146 1.704858 -0.811722 2.931119 2.222853 0.000000 0.000000 0.000000 0.000000 0.00000
5 0.826437 -0.141297 0.701233 0.741411 0.775605 1.671835 0.000000 0.000000 0.000000 0.00000
6 0.943232 0.870593 0.206124 0.607590 0.010224 -0.283979 0.813650 0.000000 0.000000 0.00000
7 -0.885628 -0.258003 0.420645 0.513248 0.936437 -0.968213 -0.300263 0.650674 0.000000 0.00000
8 -1.697268 -1.582707 1.706454 2.147647 -1.235453 -0.342169 -0.034859 -0.322178 1.165465 0.00000
9 1.634358 -0.035951 -1.370252 -0.020396 -1.575054 -1.490241 -1.579371 -0.799944 -0.024402 1.48992

     Here is the left preconditioned version of CG.

In [57]:
def PCGleft(A,b,x,tol):
    r=b-A@x; 
    M=ichol(A);
    z=np.linalg.solve(M,r);
    p=z; i=0;
    
    while np.linalg.norm(r) >= tol*np.linalg.norm(b):
        Ap=A@p;
        alpha=(r.T@z)/(Ap.T@p);
        x=x+alpha*p;
        r_temp=r;
        z_temp=z;
        r=r-(alpha*Ap);
        z=np.linalg.solve(M,r);
        beta=(r.T@z)/(r_temp.T@z_temp);
        p=z+beta*p;
        i=i+1;
        if (i % 1) ==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i) + ", " + "Error = " + str(error));
        
    return x;
In [58]:
n=100; m=100;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
In [59]:
x=PCGleft(A,b,x,tol)
Iteration = 1, Error = 1069.7565098543462

     Here is the split preconditioned version of CG.

In [62]:
def PCGsplit(A,b,x0,tol):
    r=b-A@x0; maxit=10000;
    M=ichol(A);
    z=np.linalg.solve(M,r);
    gamma=(z.T@r)/(z.T@A@z);
    x=gamma*z+x0; w=1; i=0;
    xTemp0=x0;
    
    while np.linalg.norm(r) >= tol*np.linalg.norm(b):
        r=b-A@x;
        zTemp=z; xTemp1=x;
        z=np.linalg.solve(M,r);
        gammaTemp=gamma;
        gamma=(z.T@r)/(z.T@A@z);
        w=(1-((gamma/gammaTemp)*((z.T@r)/(zTemp.T@M@zTemp))*(1/w)));
        v=gamma*z+xTemp1-xTemp0;
        u=w*v;
        x=xTemp0+u;
        xTemp0=xTemp1;
        xTemp1=x;
        i=i+1;
        if (i % 1) ==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i) + ", " + "Error = " + str(error));
        if i == maxit:
            break;
        
    return x;
In [65]:
n=300; m=300;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
In [66]:
x=PCGsplit(A,b,x,tol)
Iteration = 1, Error = 298.3361529207821
Iteration = 2, Error = 204.72944231038954
Iteration = 3, Error = 129.26335027890008
Iteration = 4, Error = 161.4620031479076
Iteration = 5, Error = 851.1707397473474
Iteration = 6, Error = 900.7922608434445
Iteration = 7, Error = 129.24292237539856
Iteration = 8, Error = 24.5120420226455
Iteration = 9, Error = 4.26251738456976
Iteration = 10, Error = 0.8593802455085187
Iteration = 11, Error = 0.16015346891559123
Iteration = 12, Error = 0.03476858399231389
Iteration = 13, Error = 0.007016263020111355
Iteration = 14, Error = 0.0016601527157812893
Iteration = 15, Error = 0.00036793485293862423
Iteration = 16, Error = 9.64839156547177e-05
Iteration = 17, Error = 2.3959320163588707e-05
Iteration = 18, Error = 7.135183205324447e-06
Iteration = 19, Error = 2.0466076794224895e-06
Iteration = 20, Error = 7.196314063490302e-07
Iteration = 21, Error = 2.510088677011322e-07
Iteration = 22, Error = 1.1188219149093892e-07
Iteration = 23, Error = 5.2668472089858884e-08
Iteration = 24, Error = 3.517921356070309e-08
Iteration = 25, Error = 3.05704617716336e-08
Iteration = 26, Error = 7.098783029811836e-08
Iteration = 27, Error = 1.4444289350148972e-07
Iteration = 28, Error = 5.755822056095987e-08
Iteration = 29, Error = 1.0813612907861421e-07
Iteration = 30, Error = 9.061729961677658e-07
Iteration = 31, Error = 4.593887419163355e-07
Iteration = 32, Error = 4.975117302353126e-08
Iteration = 33, Error = 5.084889559990307e-09
Iteration = 34, Error = 5.89201182119404e-10
Iteration = 35, Error = 6.161614422285384e-11

Coding.5 Implement the CGS (e.g. conjugate gradient squared method) and BICGSTAB (e.g. Biconjugate Gradient Stabilized method), both should be the non-precondition versions of these methods, and following the implementation of the non-precondition versions of these methods, then implement the precondition versions: note, preconditioned by zero-fill $\text{ilu}$.

     Solution:

     Here is an unstable implementation of CGS.

In [72]:
def cgs(A,b,x,tol):
    
    flag=0; maxit=10000;
    bnrm2=np.linalg.norm(b);
    if bnrm2==0.0:
        bnrm2=1;
    r=b-A@x;
    error=np.linalg.norm(r)/bnrm2;
    if error < tol:
        return x;
    r_tld=r;
    
    for iter in range(maxit):
        rho=(r_tld.T@r);
        if rho == 0.0:
            print("error: rho became zero, so the algorithm was stopped.")
            break;
        if iter > 0:
            beta=rho/rho_1;
            u=r+beta*q;
            p=r+beta*(q+beta*p);
        else:
            u=r;
            p=u;
            
        p_hat=p
        v_hat=A@p_hat;
        alpha=rho/(r_tld.T@v_hat);
        q=u-(alpha*v_hat);
        u_hat=(u+q);
        x=x+(alpha*u_hat);
        r=r-(alpha*A@u_hat);
        error=np.linalg.norm(r)/bnrm2;
        if error < tol:
            break;
        rho_1=rho;
        if (iter % 10) ==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(iter) + ", " + "Error = " + str(error));
        
    if error < tol:
        flog=1;
    elif rho==0.0:
        flag=-1;
    else:
        flag=0;
        
    return x;
        
In [68]:
n=10; m=10;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
In [69]:
x=cgs(A,b,x,tol)
Iteration = 0, Error = 10.53494558731843
Iteration = 10, Error = 1.7256271557634575
Iteration = 20, Error = 1.1571307442229524
Iteration = 30, Error = 2.806251621610245
Iteration = 40, Error = 1.9752424262087536
Iteration = 50, Error = 0.5632942615071121
Iteration = 60, Error = 0.0919493215015062
Iteration = 70, Error = 0.05923156291597769
Iteration = 80, Error = 0.059228578366798616
Iteration = 90, Error = 0.05922857294303022
Iteration = 100, Error = 0.0592285729547866
Iteration = 110, Error = 0.059228572954784106
error: rho became zero, so the algorithm was stopped.
In [70]:
x=cgs(A,b,x,tol)
Iteration = 0, Error = 3.432853910643634
Iteration = 10, Error = 1.6840695671647112
Iteration = 20, Error = 0.6830353535660242
Iteration = 30, Error = 1.0260904218716613
Iteration = 40, Error = 1.469067596100978
Iteration = 50, Error = 1.340208709268638
Iteration = 60, Error = 0.9616990217355383
Iteration = 70, Error = 0.6741012717097074
Iteration = 80, Error = 0.4659188857745712
Iteration = 90, Error = 0.23859493372831825
Iteration = 100, Error = 0.04345457009945859
Iteration = 110, Error = 0.007825236985434384
Iteration = 120, Error = 0.007817271636537481
Iteration = 130, Error = 0.007817282194387173
Iteration = 140, Error = 0.007817282199430684
Iteration = 150, Error = 0.007817282199430986
error: rho became zero, so the algorithm was stopped.
In [71]:
x=cgs(A,b,x,tol)
Iteration = 0, Error = 0.001498347896481122
Iteration = 10, Error = 0.004526851341240384
Iteration = 20, Error = 0.00715552334929545
Iteration = 30, Error = 0.0051169262051169425
Iteration = 40, Error = 0.0031531279502610963
Iteration = 50, Error = 0.0015546659773898494
Iteration = 60, Error = 0.000474116538656269
Iteration = 70, Error = 0.00022650372809981983
Iteration = 80, Error = 0.00022645213347779593
Iteration = 90, Error = 0.00022645213603078845
Iteration = 100, Error = 0.00022645213602885664
Iteration = 110, Error = 0.00022645213602885664
error: rho became zero, so the algorithm was stopped.
In [73]:
x=cgs(A,b,x,tol)
Iteration = 0, Error = 0.013154379694558951
Iteration = 10, Error = 0.011873537825198401
Iteration = 20, Error = 0.007486715327086097
Iteration = 30, Error = 0.003321499354919438
Iteration = 40, Error = 0.0027402129346744044
Iteration = 50, Error = 0.001415301576050847
Iteration = 60, Error = 0.00024487302897238086
Iteration = 70, Error = 3.936952223735394e-05
Iteration = 80, Error = 3.9162152879845274e-05
Iteration = 90, Error = 3.91623235125356e-05
Iteration = 100, Error = 3.916232365415987e-05
error: rho became zero, so the algorithm was stopped.
In [74]:
x=cgs(A,b,x,tol)
Iteration = 0, Error = 7.6163377620116575e-06
Iteration = 10, Error = 2.0878106170527924e-05
Iteration = 20, Error = 2.0560523301000405e-05
Iteration = 30, Error = 1.4759820547850368e-05
Iteration = 40, Error = 1.0020957190542835e-05
Iteration = 50, Error = 5.339935125692039e-06
Iteration = 60, Error = 2.1469075741725418e-06
Iteration = 70, Error = 7.471636801698677e-07
Iteration = 80, Error = 7.268370041023725e-07
Iteration = 90, Error = 7.268358034818111e-07
Iteration = 100, Error = 7.268358038382663e-07
Iteration = 110, Error = 7.268358038382663e-07
error: rho became zero, so the algorithm was stopped.

     Here is the a stable implementation of CGS.

In [81]:
def conjgs(A,b,x,tol):
    i=0;
    r=b-A@x; r_tld=r;
    u=r; p=u; 
    while np.linalg.norm(r) > tol:
        Ap=A@p;
        alpha=(r.T@r_tld)/(Ap.T@r_tld);
        q=u-alpha*Ap;
        x=x+(alpha*(u+q));
        r_temp=r;
        r=r-alpha*A@(u+q);
        if (i % 10) ==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i) + ", " + "Error = " + str(error));
        if np.linalg.norm(r) < tol:
            break;
        beta=(r.T@r_tld)/(r_temp.T@r_tld);
        u=r+beta*q;
        p=u+beta*(q+beta*p);
        i=i+1;
        
    return x;
    
In [82]:
n=500; m=500;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
In [83]:
x=conjgs(A,b,x,tol)
Iteration = 0, Error = 2871.8533561406994
Iteration = 10, Error = 31.36421580436442
Iteration = 20, Error = 9.078649490978263
Iteration = 30, Error = 4.913006953442978
Iteration = 40, Error = 3.3210054551135
Iteration = 50, Error = 2.414355248071721
Iteration = 60, Error = 1.8890828955825525
Iteration = 70, Error = 1.541811651131636
Iteration = 80, Error = 1.3234133107700905
Iteration = 90, Error = 31.865414905153628
Iteration = 100, Error = 7.431167764037988
Iteration = 110, Error = 1.1056857492886387
Iteration = 120, Error = 1.0552903231302382
Iteration = 130, Error = 10.729537663980441
Iteration = 140, Error = 0.9191983637280811
Iteration = 150, Error = 0.9132605941981181
Iteration = 160, Error = 0.942138024271691
Iteration = 170, Error = 1.0959846574630983
Iteration = 180, Error = 1.1050147019382746
Iteration = 190, Error = 1.1379793941140885
Iteration = 200, Error = 1.2720233300429067
Iteration = 210, Error = 1.2387998470163006
Iteration = 220, Error = 1.6601489111954042
Iteration = 230, Error = 1.8626799788714932
Iteration = 240, Error = 54.450029099465134
Iteration = 250, Error = 276.3395241717533
Iteration = 260, Error = 2.691902617976994
Iteration = 270, Error = 1.8734484015103146
Iteration = 280, Error = 684.5770560555
Iteration = 290, Error = 5.191293494161494
Iteration = 300, Error = 2.41509250673755
Iteration = 310, Error = 1.6729301795544076
Iteration = 320, Error = 2.379195165598295
Iteration = 330, Error = 6.542402713935021
Iteration = 340, Error = 1.5302975965368781
Iteration = 350, Error = 2.263994113952519
Iteration = 360, Error = 1.0872366516522045
Iteration = 370, Error = 0.8279058713020214
Iteration = 380, Error = 0.9043255420408243
Iteration = 390, Error = 0.709887072866221
Iteration = 400, Error = 0.6492440381893032
Iteration = 410, Error = 22.12014251881592
Iteration = 420, Error = 0.5882298689752052
Iteration = 430, Error = 0.5778620399878075
Iteration = 440, Error = 0.6987164500508306
Iteration = 450, Error = 0.605389538011253
Iteration = 460, Error = 0.5548502116316032
Iteration = 470, Error = 0.6421820597227693
Iteration = 480, Error = 0.7871969836558265
Iteration = 490, Error = 0.7916684521900951
Iteration = 500, Error = 2.124489290429589
Iteration = 510, Error = 4374.639848521565
Iteration = 520, Error = 1.496258597916121
Iteration = 530, Error = 1.6388036619007234
Iteration = 540, Error = 2.2307579736401357
Iteration = 550, Error = 2.4413920486461174
Iteration = 560, Error = 2.8361004168138852
Iteration = 570, Error = 4.12285507733666
Iteration = 580, Error = 68.16686871100134
Iteration = 590, Error = 6.521616436129132
Iteration = 600, Error = 79.39204736008958
Iteration = 610, Error = 45.354679347856354
Iteration = 620, Error = 72.7747135219278
Iteration = 630, Error = 5056.239354008007
Iteration = 640, Error = 43.49933771241108
Iteration = 650, Error = 65.21997700752912
Iteration = 660, Error = 77.39613682709816
Iteration = 670, Error = 24501.91741179567
Iteration = 680, Error = 978.0054301946936
Iteration = 690, Error = 815.6746884046746
Iteration = 700, Error = 8724.516166719017
Iteration = 710, Error = 764.6496208281809
Iteration = 720, Error = 864.0988962190866
Iteration = 730, Error = 1238.9753043573814
Iteration = 740, Error = 1281.7077012205693
Iteration = 750, Error = 1648.2111942917113
Iteration = 760, Error = 885.8360052078173
Iteration = 770, Error = 932.6003236033989
Iteration = 780, Error = 686.9259970349867
Iteration = 790, Error = 1514.615514209715
Iteration = 800, Error = 553.7442305945425
Iteration = 810, Error = 479.3901650555783
Iteration = 820, Error = 3006.243504547265
Iteration = 830, Error = 349162.13597954356
Iteration = 840, Error = 345.78891344284597
Iteration = 850, Error = 305.0490871867664
Iteration = 860, Error = 290.2504366579676
Iteration = 870, Error = 277.11060502547184
Iteration = 880, Error = 1178.7604825260398
Iteration = 890, Error = 1110.0125890768659
Iteration = 900, Error = 271.47469415093667
Iteration = 910, Error = 847.5085128593391
Iteration = 920, Error = 1550.1984352769516
Iteration = 930, Error = 472.5121860470033
Iteration = 940, Error = 510.16419947003294
Iteration = 950, Error = 6.400213402899035
Iteration = 960, Error = 3.376172443446102
Iteration = 970, Error = 273.78730097487227
Iteration = 980, Error = 10.383891089882962
Iteration = 990, Error = 6.308519187799503
Iteration = 1000, Error = 5.506099515352659
Iteration = 1010, Error = 0.3706205564623151
Iteration = 1020, Error = 0.020466678802695263
Iteration = 1030, Error = 0.018638563180223642
Iteration = 1040, Error = 0.0036157174393165875
Iteration = 1050, Error = 0.0010072135677585644
Iteration = 1060, Error = 0.0008475938515412747
Iteration = 1070, Error = 0.0007484653633967388
Iteration = 1080, Error = 0.0003498620996699288
Iteration = 1090, Error = 0.00010896478755981692
Iteration = 1100, Error = 0.0007314038747101037
Iteration = 1110, Error = 0.004809618854439786
Iteration = 1120, Error = 1.8539717371558007e-05
Iteration = 1130, Error = 0.0005117893425112399
Iteration = 1140, Error = 0.0014182039685604897
Iteration = 1150, Error = 0.0009370521874243284
Iteration = 1160, Error = 0.1788175292433678
Iteration = 1170, Error = 0.11743185244486795
Iteration = 1180, Error = 7.144582193302272e-07
Iteration = 1190, Error = 4.235957250424369e-06
Iteration = 1200, Error = 8.392211638232312e-05
Iteration = 1210, Error = 9.591936017633436e-06
Iteration = 1220, Error = 0.0015201205626032205
Iteration = 1230, Error = 0.00020665803136557528
Iteration = 1240, Error = 0.0005136073042430838
Iteration = 1250, Error = 0.00011223080372260021
Iteration = 1260, Error = 0.0002363492571607938
Iteration = 1270, Error = 0.0002466277708989337
Iteration = 1280, Error = 0.00017014215244989868
Iteration = 1290, Error = 7.448198287658125e-05
Iteration = 1300, Error = 0.0003061536056816067
Iteration = 1310, Error = 0.0001455331877065224
Iteration = 1320, Error = 8.075993402110253e-05
Iteration = 1330, Error = 1.1660678882556112e-06
Iteration = 1340, Error = 3.5640549130418813e-06
Iteration = 1350, Error = 9.036478872261653e-06
Iteration = 1360, Error = 2.2185085606806796e-06
Iteration = 1370, Error = 9.083163245060318e-07
Iteration = 1380, Error = 4.292628187531542e-06
Iteration = 1390, Error = 0.0027149697758966306
Iteration = 1400, Error = 7.608449926350945e-08
Iteration = 1410, Error = 1.8323699477593928e-08
Iteration = 1420, Error = 4.610103705923063e-06
Iteration = 1430, Error = 5.523061363113384e-05
Iteration = 1440, Error = 1.645244008882812e-07
Iteration = 1450, Error = 1.2028648834917916e-07
Iteration = 1460, Error = 1.524820825767596e-08
Iteration = 1470, Error = 2.332083354341209e-07
Iteration = 1480, Error = 1.975158599168492e-08
Iteration = 1490, Error = 2.4005445899167873e-08
Iteration = 1500, Error = 3.926302299692695e-08
Iteration = 1510, Error = 1.7569269712745813e-08
Iteration = 1520, Error = 2.1292650388105363e-08
Iteration = 1530, Error = 2.9877728478169306e-07
Iteration = 1540, Error = 2.4003492867959115e-07
Iteration = 1550, Error = 3.1087678621854516e-08
Iteration = 1560, Error = 1.6006999206604675e-08
Iteration = 1570, Error = 2.1828147470726827e-08
Iteration = 1580, Error = 3.501216199070098e-08
Iteration = 1590, Error = 5.125302912245167e-07
Iteration = 1600, Error = 8.139377305195574e-07
Iteration = 1610, Error = 2.749076187159222e-08
Iteration = 1620, Error = 2.3779271173651045e-07
Iteration = 1630, Error = 6.407962283501196e-08
Iteration = 1640, Error = 4.275273158720547e-07
Iteration = 1650, Error = 1.9350013702377336e-07
Iteration = 1660, Error = 6.410810070950382e-06
Iteration = 1670, Error = 5.526025196460061e-06
Iteration = 1680, Error = 5.604694657905412e-05
Iteration = 1690, Error = 8.395282144398153e-05
Iteration = 1700, Error = 0.00034981002154852727
Iteration = 1710, Error = 0.00027369223417238723
Iteration = 1720, Error = 0.0001009482349792831
Iteration = 1730, Error = 0.00010886097802131508
Iteration = 1740, Error = 2.8234473579397822e-05
Iteration = 1750, Error = 2.52410249863342e-05
Iteration = 1760, Error = 0.00033399234656777315
Iteration = 1770, Error = 0.10243933109402723
Iteration = 1780, Error = 0.00020298352404806612
Iteration = 1790, Error = 0.0006365648517026071
Iteration = 1800, Error = 0.0026217311698021536
Iteration = 1810, Error = 0.0017766134542645485
Iteration = 1820, Error = 0.002472384894721338
Iteration = 1830, Error = 0.002111409263232113
Iteration = 1840, Error = 0.06096739283717649
Iteration = 1850, Error = 0.00816382278602135
Iteration = 1860, Error = 0.012403554807568436
Iteration = 1870, Error = 0.0015062529589873505
Iteration = 1880, Error = 0.002483955371052914
Iteration = 1890, Error = 0.0018544088095495666
Iteration = 1900, Error = 0.0008625088401928971
Iteration = 1910, Error = 0.0008573380971639266
Iteration = 1920, Error = 0.00026209466197798945
Iteration = 1930, Error = 0.0014744134228868728
Iteration = 1940, Error = 9.905579772436665e-06
Iteration = 1950, Error = 8.228321707729318e-05
Iteration = 1960, Error = 0.00013410423216279129
Iteration = 1970, Error = 1.6399987169313358e-05
Iteration = 1980, Error = 0.00010123835796733025
Iteration = 1990, Error = 9.831808114628423e-06
Iteration = 2000, Error = 0.004920460224620118
Iteration = 2010, Error = 3.546494355032547e-06
Iteration = 2020, Error = 1.6826372240557554e-06
Iteration = 2030, Error = 1.936899817074765e-06
Iteration = 2040, Error = 2.0234266401886987e-06
Iteration = 2050, Error = 3.2425545591044164e-06
Iteration = 2060, Error = 2.6327212026510315e-06
Iteration = 2070, Error = 3.1515153755561133e-06
Iteration = 2080, Error = 4.596431033524252e-05
Iteration = 2090, Error = 0.00022213578533486698
Iteration = 2100, Error = 0.0009578063449814345
Iteration = 2110, Error = 1.0264829629445676e-06
Iteration = 2120, Error = 1.2380972037575447e-07
Iteration = 2130, Error = 1.7165573670359022e-06
Iteration = 2140, Error = 3.79851788607805e-06
Iteration = 2150, Error = 2.085827929735561e-07
Iteration = 2160, Error = 4.763201223300193e-07
Iteration = 2170, Error = 4.316055885049705e-07
Iteration = 2180, Error = 2.489370859762477e-08
Iteration = 2190, Error = 5.007339419816711e-08
Iteration = 2200, Error = 1.784751350767656e-08
Iteration = 2210, Error = 1.7852559701651874e-08
Iteration = 2220, Error = 1.7993337760719194e-08
Iteration = 2230, Error = 1.8005084751082344e-08
Iteration = 2240, Error = 4.877649557268009e-08
Iteration = 2250, Error = 1.8578089641745266e-08

     Here is the preconditioned stable implementation of CGS.

In [84]:
def Pconjgs(A,b,x,tol):
    i=0;
    M=ichol(A);
    r=b-A@x; r_tld=r;
    u=r; p=u;
    error=np.linalg.norm(b-A@x);
    print("Iteration = " + str(i) + ", " + "Error = " + str(error));
    while np.linalg.norm(r) > tol:
        p=np.linalg.solve(M,p);
        Ap=A@p;
        alpha=(r.T@r_tld)/(Ap.T@r_tld);
        q=u-alpha*Ap;
        u_hat=np.linalg.solve(A,(u+q));
        x=x+(alpha*u_hat);
        r_temp=r;
        r=r-alpha*A@u_hat;
        if (i % 1) ==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i+1) + ", " + "Error = " + str(error));
        if np.linalg.norm(r) < tol:
            break;
        beta=(r.T@r_tld)/(r_temp.T@r_tld);
        u=r+beta*q;
        p=u+beta*(q+beta*p);
        i=i+1;
        
    return x;
    
In [85]:
n=500; m=500;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
In [86]:
x=Pconjgs(A,b,x,tol)
Iteration = 0, Error = 535.9690538939683
Iteration = 1, Error = 3.989678650926232e-13

     Here is Biconjugate gradient stable method.

In [138]:
def BICGStab(A,b,x,tol):
    i=0;
    r=b-A@x; r_tld=r;
    p=r;
    error=np.linalg.norm(r);
    print("Iteration = " + str(i) + ", " + "Error = " + str(error));
    while np.linalg.norm(r) > tol:
        Ap=A@p;
        alpha=(r.T@r_tld)/(Ap.T@r_tld);
        s=r-alpha*Ap;
        As=A@s;
        w=(As.T@s)/(As.T@As);
        x=x+alpha*p+w*s;
        r=s-w*As;
        if (i % 100) ==0:
            error=np.linalg.norm(r);
            print("Iteration = " + str(i+200) + ", " + "Error = " + str(error));
        if np.linalg.norm(r) < tol:
            break;
        beta=(alpha/w)*((r.T@r_tld)/(r.T@r_tld));
        p=r+(beta*(p-w*Ap));
        i=i+1;
        
    return x;
In [139]:
n=10; m=10;
A=SPD(n,m);
b=np.random.randn(m);
x=np.random.randn(m);
tol=1e-10;
In [140]:
x=BICGStab(A,b,x,tol)
Iteration = 0, Error = 78.24071092333004
Iteration = 200, Error = 5.874309359633821
Iteration = 300, Error = 0.47475361274814226
Iteration = 400, Error = 0.10885243400109348
Iteration = 500, Error = 97.97629281429144
Iteration = 600, Error = 2.3646303299740654
Iteration = 700, Error = 255.57337995232626
Iteration = 800, Error = 1721.1678957916708
Iteration = 900, Error = 13356.32869331797
Iteration = 1000, Error = 232.19900817148823
Iteration = 1100, Error = 16.42767335754242
Iteration = 1200, Error = 1.0585325393049663
Iteration = 1300, Error = 0.583916131018741
Iteration = 1400, Error = 0.04131735151077903
Iteration = 1500, Error = 0.02999332970571374
Iteration = 1600, Error = 2.1380670430141873
Iteration = 1700, Error = 24.370666346844352
Iteration = 1800, Error = 4.940168000340099
Iteration = 1900, Error = 0.7267465503736613
Iteration = 2000, Error = 0.11091789979900095
Iteration = 2100, Error = 1.7739516716745949
Iteration = 2200, Error = 0.5461805325965746
Iteration = 2300, Error = 0.2865959838148986
Iteration = 2400, Error = 1.1934899449410343
Iteration = 2500, Error = 0.031804049525613935
Iteration = 2600, Error = 0.0045880870373940135
Iteration = 2700, Error = 0.009052848038424254
Iteration = 2800, Error = 6.872249948263537e-05
Iteration = 2900, Error = 0.0022283991647527365
Iteration = 3000, Error = 8.914066012822276e-05
Iteration = 3100, Error = 7.796994909237536e-06
Iteration = 3200, Error = 2.758942480377419e-07
Iteration = 3300, Error = 3.991373317992016e-09
Iteration = 3400, Error = 9.766761475095392e-09
Iteration = 3500, Error = 4.117043930569576e-09
Iteration = 3600, Error = 1.325082220185301e-10

     Here is preconditioned Biconjugate gradient stable method.

In [94]:
def PBICGStab(A,b,x,tol):
    i=0;
    M=ichol(A);
    r=b-A@x; r_tld=r;
    p=r;
    error=np.linalg.norm(r);
    print("Iteration = " + str(i) + ", " + "Error = " + str(error));
    while np.linalg.norm(r) > tol:
        p=np.linalg.solve(M,p);
        Ap=A@p;
        alpha=(r.T@r_tld)/(Ap.T@r_tld);
        s=r-alpha*Ap;
        s=np.linalg.solve(M,s);
        As=A@s;
        w=(As.T@s)/(As.T@As);
        x=x+alpha*p+w*s;
        r=s-w*As;
        if (i % 1) ==0:
            error=np.linalg.norm(r);
            print("Iteration = " + str(i+1) + ", " + "Error = " + str(error));
        if np.linalg.norm(r) < tol:
            break;
        beta=(alpha/w)*((r.T@r_tld)/(r.T@r_tld));
        p=r+(beta*(p-w*Ap));
        i=i+1;
        
    return x;
In [95]:
n=500; m=500;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
In [96]:
x=PBICGStab(A,b,x,tol)
Iteration = 0, Error = 520.820626322756
Iteration = 1, Error = 5.163480703387725e-13

Reference Page¶

1. https://people.math.ethz.ch/~mhg/pub/biksm.pdf

2. https://www.mcs.anl.gov/papers/P2099-0612.pdf

3. https://en.wikipedia.org/wiki/Krylov_subspace

And lastly a art piece of Kromia!

best_neko_friends.png